Search | BVS Regional Portal

1.

BioHackathon 2015: Semantics of data for life sciences and reproducible research.

Vos, Rutger A; Katayama, Toshiaki; Mishima, Hiroyuki; Kawano, Shin; Kawashima, Shuichi; Kim, Jin-Dong; Moriya, Yuki; Tokimatsu, Toshiaki; Yamaguchi, Atsuko; Yamamoto, Yasunori; Wu, Hongyan; Amstutz, Peter; Antezana, Erick; Aoki, Nobuyuki P; Arakawa, Kazuharu; Bolleman, Jerven T; Bolton, Evan; Bonnal, Raoul J P; Bono, Hidemasa; Burger, Kees; Chiba, Hirokazu; Cohen, Kevin B; Deutsch, Eric W; Fernández-Breis, Jesualdo T; Fu, Gang; Fujisawa, Takatomo; Fukushima, Atsushi; García, Alexander; Goto, Naohisa; Groza, Tudor; Hercus, Colin; Hoehndorf, Robert; Itaya, Kotone; Juty, Nick; Kawashima, Takeshi; Kim, Jee-Hyub; Kinjo, Akira R; Kotera, Masaaki; Kozaki, Kouji; Kumagai, Sadahiro; Kushida, Tatsuya; Lütteke, Thomas; Matsubara, Masaaki; Miyamoto, Joe; Mohsen, Attayeb; Mori, Hiroshi; Naito, Yuki; Nakazato, Takeru; Nguyen-Xuan, Jeremy; Nishida, Kozo.

F1000Res ; 9: 136, 2020.

Article in English | MEDLINE | ID: mdl-32308977

ABSTRACT

We report on the activities of the 2015 edition of the BioHackathon, an annual event that brings together researchers and developers from around the world to develop tools and technologies that promote the reusability of biological data. We discuss issues surrounding the representation, publication, integration, mining and reuse of biological data and metadata across a wide range of biomedical data types of relevance for the life sciences, including chemistry, genotypes and phenotypes, orthology and phylogeny, proteomics, genomics, glycomics, and metabolomics. We describe our progress to address ongoing challenges to the reusability and reproducibility of research results, and identify outstanding issues that continue to impede the progress of bioinformatics research. We share our perspective on the state of the art, continued challenges, and goals for future research and development for the life sciences Semantic Web.

Subjects

Biological Science Disciplines, Computational Biology, Semantic Web, Data Mining, Metadata, Reproducibility of Results

2.

Sharing Programming Resources Between Bio* Projects.

Bonnal, Raoul J P; Yates, Andrew; Goto, Naohisa; Gautier, Laurent; Willis, Scooter; Fields, Christopher; Katayama, Toshiaki; Prins, Pjotr.

Methods Mol Biol ; 1910: 747-766, 2019.

Article in English | MEDLINE | ID: mdl-31278684

ABSTRACT

Open-source software encourages computer programmers to reuse software components written by others. In evolutionary bioinformatics, open-source software comes in a broad range of programming languages, including C/C++, Perl, Python, Ruby, Java, and R. To avoid writing the same functionality multiple times for different languages, it is possible to share components by bridging computer languages and Bio* projects, such as BioPerl, Biopython, BioRuby, BioJava, and R/Bioconductor. In this chapter, we compare the three principal approaches for sharing software between different programming languages: by remote procedure call (RPC), by sharing a local "call stack," and by calling program to programs. RPC provides a language-independent protocol over a network interface; examples are SOAP and Rserve. The local call stack provides a between-language mapping, not over the network interface but directly in computer memory; examples are R bindings, RPy, and languages sharing the Java virtual machine stack. This functionality provides strategies for sharing of software between Bio* projects, which can be exploited more often. Here, we present cross-language examples for sequence translation and measure throughput of the different options. We compare calling into R through native R, RSOAP, Rserve, and RPy interfaces, with the performance of native BioPerl, Biopython, BioJava, and BioRuby implementations and with call stack bindings to BioJava and the European Molecular Biology Open Software Suite (EMBOSS). In general, call stack approaches outperform native Bio* implementations, and these, in turn, outperform "RPC"-based approaches. To test and compare strategies, we provide a downloadable Docker container with all examples, tools, and libraries included.
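The benchmark task named in this abstract, sequence translation with a throughput measurement, can be sketched in a few lines. The following is a minimal pure-Python illustration and not the chapter's own code: the names `translate` and `throughput` are hypothetical, the codon table is the standard genetic code, and no Bio* library, RPC layer, or call-stack binding is involved.

```python
import time
from itertools import product

# Standard genetic code (NCBI translation table 1).
# Codons are enumerated over the bases T, C, A, G, first position varying slowest.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string in reading frame 1.

    Stop codons are written as '*'; codons with unknown bases become 'X'.
    A trailing partial codon is ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )


def throughput(sequences, repeats: int = 1) -> float:
    """Return translated sequences per second (hypothetical helper)."""
    start = time.perf_counter()
    for _ in range(repeats):
        for seq in sequences:
            translate(seq)
    elapsed = time.perf_counter() - start
    total = len(sequences) * repeats
    return total / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    print(translate("ATGGCCTAA"))
    print(f"{throughput(['ATGGCCTAA'] * 1000):.0f} sequences/s")
```

A cross-language comparison of the kind the abstract describes would time the same input through each interface in turn; this sketch only shows the shape of one native measurement.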

Subjects

Computational Biology, Programming Languages, Software, Computational Biology/methods, User-Computer Interface, Web Browser

3.

A new approach to identifying hypertension-associated genes in the mesenteric artery of spontaneously hypertensive rats and stroke-prone spontaneously hypertensive rats.

Ikawa, Takashi; Watanabe, Yuko; Okuzaki, Daisuke; Goto, Naohisa; Okamura, Nobutaka; Yamanishi, Kyosuke; Higashino, Toshihide; Yamanishi, Hiromichi; Okamura, Haruki; Higashino, Hideaki.

J Hypertens ; 37(8): 1644-1656, 2019 08.

Article in English | MEDLINE | ID: mdl-30882592

ABSTRACT

OBJECTIVE: Hypertension is one of the most prevalent diseases in humans who live a modern lifestyle. Alongside more effective care, clarification of the genetic background of hypertension is urgently required. Gene expression in mesenteric resistance arteries of spontaneously hypertensive rats (SHR), stroke-prone SHR (SHRSP) and two types of renal hypertensive Wistar Kyoto rats (WKY), two kidneys and one clip renal hypertensive rat (2K1C) and one kidney and one clip renal hypertensive rat (1K1C), was compared using DNA microarrays. METHODS: We used a simultaneous equation and comparative selection method to identify genes associated with hypertension using the Reactome analysis tool and GenBank database. RESULTS: The expression of 298 genes was altered between SHR and WKY (44 upregulated and 254 downregulated), while the expression of 290 genes was altered between SHRSP and WKY (83 upregulated and 207 downregulated). For SHRSP versus SHR, the expression of 60 genes was altered (36 upregulated and 24 downregulated). Several genes expressed in SHR and SHRSP were also expressed in the renovascular hypertensive 2K1C and 1K1C rats, indicative of the existence of hyper-renin and/or hypervolemic pathophysiological changes in SHR and SHRSP. CONCLUSION: The overexpression of Kcnq1, Crlf1, Alb and Xirp1 and the inhibition of Galr2, Kcnh1, Ache, Chrm2 and Slc5a7 expression may indicate that a relationship exists between these genes and the cause and/or worsening of hypertension in SHR and SHRSP.

Subjects

Hypertension, Mesenteric Arteries, Transcriptome/genetics, Animals, Gene Expression Profiling, Hypertension/genetics, Hypertension/metabolism, Mesenteric Arteries/chemistry, Mesenteric Arteries/metabolism, Rats, Inbred SHR Rats, Inbred WKY Rats

4.

Fungal ITS1 Deep-Sequencing Strategies to Reconstruct the Composition of a 26-Species Community and Evaluation of the Gut Mycobiota of Healthy Japanese Individuals.

Motooka, Daisuke; Fujimoto, Kosuke; Tanaka, Reiko; Yaguchi, Takashi; Gotoh, Kazuyoshi; Maeda, Yuichi; Furuta, Yoki; Kurakawa, Takashi; Goto, Naohisa; Yasunaga, Teruo; Narazaki, Masashi; Kumanogoh, Atsushi; Horii, Toshihiro; Iida, Tetsuya; Takeda, Kiyoshi; Nakamura, Shota.

Front Microbiol ; 8: 238, 2017.

Article in English | MEDLINE | ID: mdl-28261190

ABSTRACT

The study of mycobiota remains relatively unexplored due to the lack of sufficient available reference strains and databases compared to those of bacterial microbiome studies. Deep sequencing of Internal Transcribed Spacer (ITS) regions is the de facto standard for fungal diversity analysis. However, results are often biased because of the wide variety of sequence lengths in the ITS regions and the complexity of high-throughput sequencing (HTS) technologies. In this study, a curated ITS database, ntF-ITS1, was constructed. This database can be utilized for the taxonomic assignment of fungal community members. We evaluated the efficacy of strategies for mycobiome analysis by using this database and characterizing a mock fungal community consisting of 26 species representing 15 genera using ITS1 sequencing with three HTS platforms: Illumina MiSeq (MiSeq), Ion Torrent Personal Genome Machine (IonPGM), and Pacific Biosciences (PacBio). Our evaluation demonstrated that PacBio's circular consensus sequencing with greater than 8 full-passes most accurately reconstructed the composition of the mock community. Using this strategy for deep-sequencing analysis of the gut mycobiota in healthy Japanese individuals revealed two major mycobiota types: a single-species type composed of Candida albicans or Saccharomyces cerevisiae and a multi-species type. In this study, we proposed the best possible processing strategies for the three sequencing platforms, of which, the PacBio platform allowed for the most accurate estimation of the fungal community. The database and methodology described here provide critical tools for the emerging field of mycobiome studies.

5.

Performance comparison of second- and third-generation sequencers using a bacterial genome with two chromosomes.

Miyamoto, Mari; Motooka, Daisuke; Gotoh, Kazuyoshi; Imai, Takamasa; Yoshitake, Kazutoshi; Goto, Naohisa; Iida, Tetsuya; Yasunaga, Teruo; Horii, Toshihiro; Arakawa, Kazuharu; Kasahara, Masahiro; Nakamura, Shota.

BMC Genomics ; 15: 699, 2014 Aug 21.

Article in English | MEDLINE | ID: mdl-25142801

ABSTRACT

BACKGROUND: The availability of diverse second- and third-generation sequencing technologies enables the rapid determination of the sequences of bacterial genomes. However, identifying the sequencing technology most suitable for producing a finished genome with multiple chromosomes remains a challenge. We evaluated the abilities of the following three second-generation sequencers: Roche 454 GS Junior (GS Jr), Life Technologies Ion PGM (Ion PGM), and Illumina MiSeq (MiSeq) and a third-generation sequencer, the Pacific Biosciences RS sequencer (PacBio), by sequencing and assembling the genome of Vibrio parahaemolyticus, which consists of a 5-Mb genome comprising two circular chromosomes. RESULTS: We sequenced the genome of V. parahaemolyticus with GS Jr, Ion PGM, MiSeq, and PacBio and performed de novo assembly with several genome assemblers. Although GS Jr generated the longest mean read length of 418 bp among the second-generation sequencers, the maximum contig length of the best assembly from GS Jr was 165 kbp, and the number of contigs was 309. Single runs of Ion PGM and MiSeq produced data of considerably greater sequencing coverage, 279× and 1,927×, respectively. The optimized result for Ion PGM contained 61 contigs assembled from reads of 77× coverage, and the longest contig was 895 kbp in size. Those for MiSeq were 34 contigs, 58× coverage, and 733 kbp, respectively. These results suggest that higher coverage depth is unnecessary for a better assembly result. We observed that multiple rRNA coding regions were fragmented in the assemblies from the second-generation sequencers, whereas PacBio generated two exceptionally long contigs of 3,288,561 and 1,875,537 bps, each of which was from a single chromosome, with 73× coverage and mean read length 3,119 bp, allowing us to determine the absolute positions of all rRNA operons. 
CONCLUSIONS: PacBio outperformed the other sequencers in terms of the length of contigs and reconstructed the greatest portion of the genome, achieving a genome assembly of "finished grade" because of its long reads. It showed the potential to assemble more complex genomes with multiple chromosomes containing more repetitive sequences.
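The coverage figures in this abstract follow the usual definition of sequencing depth: total sequenced bases divided by genome size. As a back-of-the-envelope sketch (the function names are hypothetical, and the read count it yields is an estimate derived from the reported contig lengths, coverage, and mean read length, not a number stated in the abstract):

```python
def coverage(num_reads: float, mean_read_length: float, genome_size: int) -> float:
    """Sequencing depth: total sequenced bases divided by genome size."""
    return num_reads * mean_read_length / genome_size


def reads_needed(target_coverage: float, mean_read_length: float, genome_size: int) -> float:
    """Number of reads required to reach a target depth (the inverse of coverage)."""
    return target_coverage * genome_size / mean_read_length


# Genome size taken as the sum of the two PacBio contigs reported in the abstract.
GENOME_SIZE = 3_288_561 + 1_875_537

# Assumption: all reads contribute at the reported mean length of 3,119 bp.
estimated_reads = reads_needed(73, 3_119, GENOME_SIZE)

if __name__ == "__main__":
    print(f"genome size: {GENOME_SIZE} bp")
    print(f"estimated reads for 73x coverage: {estimated_reads:.0f}")
```

The same two functions make it easy to see why the downsampled Ion PGM and MiSeq assemblies (77× and 58×) used only a fraction of the reads from a full run.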

Subjects

Bacterial Chromosomes/genetics, Genomics/methods, Sequence Analysis/methods, Bacterial Genome/genetics, High-Throughput Nucleotide Sequencing, Vibrio parahaemolyticus/genetics

6.

BioHackathon series in 2011 and 2012: penetration of ontology and linked data in life science domains.

Katayama, Toshiaki; Wilkinson, Mark D; Aoki-Kinoshita, Kiyoko F; Kawashima, Shuichi; Yamamoto, Yasunori; Yamaguchi, Atsuko; Okamoto, Shinobu; Kawano, Shin; Kim, Jin-Dong; Wang, Yue; Wu, Hongyan; Kano, Yoshinobu; Ono, Hiromasa; Bono, Hidemasa; Kocbek, Simon; Aerts, Jan; Akune, Yukie; Antezana, Erick; Arakawa, Kazuharu; Aranda, Bruno; Baran, Joachim; Bolleman, Jerven; Bonnal, Raoul Jp; Buttigieg, Pier Luigi; Campbell, Matthew P; Chen, Yi-An; Chiba, Hirokazu; Cock, Peter Ja; Cohen, K Bretonnel; Constantin, Alexandru; Duck, Geraint; Dumontier, Michel; Fujisawa, Takatomo; Fujiwara, Toyofumi; Goto, Naohisa; Hoehndorf, Robert; Igarashi, Yoshinobu; Itaya, Hidetoshi; Ito, Maori; Iwasaki, Wataru; Kalas, Matús; Katoda, Takeo; Kim, Taehong; Kokubu, Anna; Komiyama, Yusuke; Kotera, Masaaki; Laibe, Camille; Lapp, Hilmar; Lütteke, Thomas; Marshall, M Scott.

J Biomed Semantics ; 5(1): 5, 2014 Feb 05.

Article in English | MEDLINE | ID: mdl-24495517

ABSTRACT

The application of semantic technologies to the integration of biological data and the interoperability of bioinformatics analysis and visualization tools has been the common theme of a series of annual BioHackathons hosted in Japan for the past five years. Here we provide a review of the activities and outcomes from the BioHackathons held in 2011 in Kyoto and 2012 in Toyama. In order to efficiently implement semantic technologies in the life sciences, participants formed various sub-groups and worked on the following topics: Resource Description Framework (RDF) models for specific domains, text mining of the literature, ontology development, essential metadata for biological databases, platforms to enable efficient Semantic Web technology development and interoperability, and the development of applications for Semantic Web data. In this review, we briefly introduce the themes covered by these sub-groups. The observations made, conclusions drawn, and software development projects that emerged from these activities are discussed.

7.

Genetic analysis of genes causing hypertension and stroke in spontaneously hypertensive rats.

Yamamoto, Hideyuki; Okuzaki, Daisuke; Yamanishi, Kyosuke; Xu, Yunfeng; Watanabe, Yuko; Yoshida, Momoko; Yamashita, Akifumi; Goto, Naohisa; Nishiguchi, Seiji; Shimada, Kazunori; Nojima, Hiroshi; Yasunaga, Teruo; Okamura, Haruki; Matsunaga, Hisato; Yamanishi, Hiromichi.

Int J Mol Med ; 31(5): 1057-65, 2013 May.

Article in English | MEDLINE | ID: mdl-23525202

ABSTRACT

Spontaneously hypertensive rats (SHR) and stroke-prone SHR (SHRSP) are frequently used as model rats not only in studies of essential hypertension and stroke, but also in studies of attention deficit hyperactivity disorder (ADHD). Normotensive Wistar-Kyoto rats (WKY) are normally used as controls in these studies. In this study, using these rats, we aimed to identify the genes causing hypertension and stroke, as well as the genes involved in ADHD. Since adrenal gland products can directly influence cardiovascular, endocrine and sympathetic nervous system functions, gene expression profiles in the adrenal glands of the 3 rat strains were examined using genome-wide microarray technology when the rats were 3 and 6 weeks of age, a period in which the rats are considered to be in a pre-hypertensive state. Gene expression profiles were compared between SHR and WKY and between SHRSP and SHR. A total of 353 genes showing more than a 4-fold increase or less than a 4-fold decrease in expression were isolated and candidate genes were selected as significantly enriched genes. SHR-specific genes isolated when the rats were 3 weeks of age contained 12 enriched genes related to transcriptional regulatory activity and those isolated when the rats were 6 weeks of age contained 6 enriched genes related to the regulation of blood pressure. SHRSP-specific genes isolated when the rats were 3 weeks of age contained 4 enriched genes related to the regulation of blood pressure and those isolated when the rats were 6 weeks of age contained 4 enriched genes related to the response to steroid hormone stimulus. Ingenuity pathway analysis of enriched SHR-specific genes revealed that 2 transcriptional regulators, cAMP responsive element modulator (Crem) and Fos-like antigen 1 (Fosl1), interact with blood pressure-regulating genes, such as neurotensin (Nts), apelin (Apln) and epoxide hydrolase 2, cytoplasmic (Ephx2). 
Similar analyses of SHRSP-specific genes revealed that angiotensinogen (Agt), one of the blood pressure-regulating genes, plays pivotal roles among SHRSP-specific genes. Moreover, genes associated with ADHD, such as low density lipoprotein receptor (Ldlr) and Crem, are discussed.

Subjects

Genetic Predisposition to Disease, Hypertension/complications, Hypertension/genetics, Stroke/complications, Stroke/genetics, Aging/genetics, Aging/pathology, Animals, Genetic Epistasis, Gene Regulatory Networks/genetics, Rats, Inbred SHR Rats

8.

The 3rd DBCLS BioHackathon: improving life science data integration with Semantic Web technologies.

Katayama, Toshiaki; Wilkinson, Mark D; Micklem, Gos; Kawashima, Shuichi; Yamaguchi, Atsuko; Nakao, Mitsuteru; Yamamoto, Yasunori; Okamoto, Shinobu; Oouchida, Kenta; Chun, Hong-Woo; Aerts, Jan; Afzal, Hammad; Antezana, Erick; Arakawa, Kazuharu; Aranda, Bruno; Belleau, Francois; Bolleman, Jerven; Bonnal, Raoul Jp; Chapman, Brad; Cock, Peter Ja; Eriksson, Tore; Gordon, Paul Mk; Goto, Naohisa; Hayashi, Kazuhiro; Horn, Heiko; Ishiwata, Ryosuke; Kaminuma, Eli; Kasprzyk, Arek; Kawaji, Hideya; Kido, Nobuhiro; Kim, Young Joo; Kinjo, Akira R; Konishi, Fumikazu; Kwon, Kyung-Hoon; Labarga, Alberto; Lamprecht, Anna-Lena; Lin, Yu; Lindenbaum, Pierre; McCarthy, Luke; Morita, Hideyuki; Murakami, Katsuhiko; Nagao, Koji; Nishida, Kozo; Nishimura, Kunihiro; Nishizawa, Tatsuya; Ogishima, Soichi; Ono, Keiichiro; Oshita, Kazuki; Park, Keun-Joon; Prins, Pjotr.

J Biomed Semantics ; 4(1): 6, 2013 Feb 11.

Article in English | MEDLINE | ID: mdl-23398680

ABSTRACT

BACKGROUND: BioHackathon 2010 was the third in a series of meetings hosted by the Database Center for Life Sciences (DBCLS) in Tokyo, Japan. The overall goal of the BioHackathon series is to improve the quality and accessibility of life science research data on the Web by bringing together representatives from public databases, analytical tool providers, and cyber-infrastructure researchers to jointly tackle important challenges in the area of in silico biological research. RESULTS: The theme of BioHackathon 2010 was the 'Semantic Web', and all attendees gathered with the shared goal of producing Semantic Web data from their respective resources, and/or consuming or interacting with those data using their tools and interfaces. We discussed topics including guidelines for designing semantic data and interoperability of resources. We consequently developed tools and clients for analysis and visualization. CONCLUSION: We provide a meeting report from BioHackathon 2010, in which we describe the discussions, decisions, and breakthroughs made as we moved towards compliance with Semantic Web technologies - from source provider, through middleware, to the end-consumer.

9.

A comprehensive analysis of reassortment in influenza A virus.

de Silva, U Chandimal; Tanaka, Hokuto; Nakamura, Shota; Goto, Naohisa; Yasunaga, Teruo.

Biol Open ; 1(4): 385-90, 2012 Apr 15.

Article in English | MEDLINE | ID: mdl-23213428

ABSTRACT

Genetic reassortment plays a vital role in the evolution of the influenza virus and has historically been linked with the emergence of pandemic strains. Reassortment is believed to occur when a single host - typically swine - is simultaneously infected with multiple influenza strains. The reassorted viral strains with novel gene combinations tend to easily evade the immune system in other host species, satisfying the basic requirements of a virus with pandemic potential. Therefore, it is vital to continuously monitor the genetic content of circulating influenza strains and keep an eye out for new reassortants. We present a new approach to identify reassortants from large data sets of influenza whole genome nucleotide sequences and report the results of the first ever comprehensive search for reassortants of all published influenza A genomic data. 35 of the 52 well supported candidate reassortants we found are reported here for the first time while our analysis method offers new insight that enables us to draw a more detailed picture of the origin of some of the previously reported reassortants. A disproportionately high number (13/52) of the candidate reassortants found were the result of the introduction of novel hemagglutinin and/or neuraminidase genes into a previously circulating virus. The method described in this paper may contribute towards automating the task of routinely searching for reassortants among newly sequenced strains.

10.

Plasmodium cynomolgi genome sequences provide insight into Plasmodium vivax and the monkey malaria clade.

Tachibana, Shin-Ichiro; Sullivan, Steven A; Kawai, Satoru; Nakamura, Shota; Kim, Hyunjae R; Goto, Naohisa; Arisue, Nobuko; Palacpac, Nirianne M Q; Honma, Hajime; Yagi, Masanori; Tougan, Takahiro; Katakai, Yuko; Kaneko, Osamu; Mita, Toshihiro; Kita, Kiyoshi; Yasutomi, Yasuhiro; Sutton, Patrick L; Shakhbatyan, Rimma; Horii, Toshihiro; Yasunaga, Teruo; Barnwell, John W; Escalante, Ananias A; Carlton, Jane M; Tanabe, Kazuyuki.

Nat Genet ; 44(9): 1051-5, 2012 Sep.

Article in English | MEDLINE | ID: mdl-22863735

ABSTRACT

P. cynomolgi, a malaria-causing parasite of Asian Old World monkeys, is the sister taxon of P. vivax, the most prevalent malaria-causing species in humans outside of Africa. Because P. cynomolgi shares many phenotypic, biological and genetic characteristics with P. vivax, we generated draft genome sequences for three P. cynomolgi strains and performed genomic analysis comparing them with the P. vivax genome, as well as with the genome of a third previously sequenced simian parasite, Plasmodium knowlesi. Here, we show that genomes of the monkey malaria clade can be characterized by copy-number variants (CNVs) in multigene families involved in evasion of the human immune system and invasion of host erythrocytes. We identify genome-wide SNPs, microsatellites and CNVs in the P. cynomolgi genome, providing a map of genetic variation that can be used to map parasite traits and study parasite populations. The sequencing of the P. cynomolgi genome is a critical step in developing a model system for P. vivax research and in counteracting the neglect of P. vivax.

Subjects

Protozoan Genome, Haplorhini/parasitology, Monkey Diseases/parasitology, Plasmodium cynomolgi/genetics, Plasmodium vivax/genetics, Animals, Base Sequence, Cluster Analysis, Protozoan Genes, Protozoan Genome/genetics, Malaria/genetics, Malaria/parasitology, Genetic Models, Molecular Sequence Data, Monkey Diseases/classification, Monkey Diseases/genetics, Phylogeny, Plasmodium cynomolgi/classification, Plasmodium vivax/classification, DNA Sequence Analysis

11.

Sharing programming resources between Bio* projects through remote procedure call and native call stack strategies.

Prins, Pjotr; Goto, Naohisa; Yates, Andrew; Gautier, Laurent; Willis, Scooter; Fields, Christopher; Katayama, Toshiaki.

Methods Mol Biol ; 856: 513-27, 2012.

Article in English | MEDLINE | ID: mdl-22399473

ABSTRACT

Open-source software (OSS) encourages computer programmers to reuse software components written by others. In evolutionary bioinformatics, OSS comes in a broad range of programming languages, including C/C++, Perl, Python, Ruby, Java, and R. To avoid writing the same functionality multiple times for different languages, it is possible to share components by bridging computer languages and Bio* projects, such as BioPerl, Biopython, BioRuby, BioJava, and R/Bioconductor. In this chapter, we compare the two principal approaches for sharing software between different programming languages: either by remote procedure call (RPC) or by sharing a local call stack. RPC provides a language-independent protocol over a network interface; examples are RSOAP and Rserve. The local call stack provides a between-language mapping not over the network interface, but directly in computer memory; examples are R bindings, RPy, and languages sharing the Java Virtual Machine stack. This functionality provides strategies for sharing of software between Bio* projects, which can be exploited more often. Here, we present cross-language examples for sequence translation, and measure throughput of the different options. We compare calling into R through native R, RSOAP, Rserve, and RPy interfaces, with the performance of native BioPerl, Biopython, BioJava, and BioRuby implementations, and with call stack bindings to BioJava and the European Molecular Biology Open Software Suite. In general, call stack approaches outperform native Bio* implementations and these, in turn, outperform RPC-based approaches. To test and compare strategies, we provide a downloadable BioNode image with all examples, tools, and libraries included. The BioNode image can be run on VirtualBox-supported operating systems, including Windows, OSX, and Linux.

Subjects

Computational Biology/methods, Software, Programming Languages

12.

Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics.

Bonnal, Raoul J P; Aerts, Jan; Githinji, George; Goto, Naohisa; MacLean, Dan; Miller, Chase A; Mishima, Hiroyuki; Pagani, Massimiliano; Ramirez-Gonzalez, Ricardo; Smant, Geert; Strozzi, Francesco; Syme, Rob; Vos, Rutger; Wennblom, Trevor J; Woodcroft, Ben J; Katayama, Toshiaki; Prins, Pjotr.

Bioinformatics ; 28(7): 1035-7, 2012 Apr 01.

Article in English | MEDLINE | ID: mdl-22332238

ABSTRACT

SUMMARY: Biogem provides a software development environment for the Ruby programming language, which encourages community-based software development for bioinformatics while lowering the barrier to entry and encouraging best practices. Biogem, with its targeted modular and decentralized approach, software generator, tools and tight web integration, is an improved general model for scaling up collaborative open source software development in bioinformatics. AVAILABILITY: Biogem and modules are free and are OSS. Biogem runs on all systems that support recent versions of Ruby, including Linux, Mac OS X and Windows. Further information at http://www.biogems.info. A tutorial is available at http://www.biogems.info/howto.html CONTACT: bonnal@ingm.org.

Subjects

Computational Biology/methods, Internet, Programming Languages, Software

13.

The 2nd DBCLS BioHackathon: interoperable bioinformatics Web services for integrated applications.

Katayama, Toshiaki; Wilkinson, Mark D; Vos, Rutger; Kawashima, Takeshi; Kawashima, Shuichi; Nakao, Mitsuteru; Yamamoto, Yasunori; Chun, Hong-Woo; Yamaguchi, Atsuko; Kawano, Shin; Aerts, Jan; Aoki-Kinoshita, Kiyoko F; Arakawa, Kazuharu; Aranda, Bruno; Bonnal, Raoul Jp; Fernández, José M; Fujisawa, Takatomo; Gordon, Paul Mk; Goto, Naohisa; Haider, Syed; Harris, Todd; Hatakeyama, Takashi; Ho, Isaac; Itoh, Masumi; Kasprzyk, Arek; Kido, Nobuhiro; Kim, Young-Joo; Kinjo, Akira R; Konishi, Fumikazu; Kovarskaya, Yulia; von Kuster, Greg; Labarga, Alberto; Limviphuvadh, Vachiranee; McCarthy, Luke; Nakamura, Yasukazu; Nam, Yunsun; Nishida, Kozo; Nishimura, Kunihiro; Nishizawa, Tatsuya; Ogishima, Soichi; Oinn, Tom; Okamoto, Shinobu; Okuda, Shujiro; Ono, Keiichiro; Oshita, Kazuki; Park, Keun-Joon; Putnam, Nicholas; Senger, Martin; Severin, Jessica; Shigemoto, Yasumasa.

J Biomed Semantics ; 2: 4, 2011 Aug 02.

Article in English | MEDLINE | ID: mdl-21806842

ABSTRACT

BACKGROUND: The interaction between biological researchers and the bioinformatics tools they use is still hampered by incomplete interoperability between such tools. To ensure interoperability initiatives are effectively deployed, end-user applications need to be aware of, and support, best practices and standards. Here, we report on an initiative in which software developers and genome biologists came together to explore and raise awareness of these issues: BioHackathon 2009. RESULTS: Developers in attendance came from diverse backgrounds, with experts in Web services, workflow tools, text mining and visualization. Genome biologists provided expertise and exemplar data from the domains of sequence and pathway analysis and glyco-informatics. One goal of the meeting was to evaluate the ability to address real world use cases in these domains using the tools that the developers represented. This resulted in i) a workflow to annotate 100,000 sequences from an invertebrate species; ii) an integrated system for analysis of the transcription factor binding sites (TFBSs) enriched based on differential gene expression data obtained from a microarray experiment; iii) a workflow to enumerate putative physical protein interactions among enzymes in a metabolic pathway using protein structure data; iv) a workflow to analyze glyco-gene-related diseases by searching for human homologs of glyco-genes in other species, such as fruit flies, and retrieving their phenotype-annotated SNPs. CONCLUSIONS: Beyond deriving prototype solutions for each use-case, a second major purpose of the BioHackathon was to highlight areas of insufficiency. 
We discuss the issues raised by our exploration of the problem/solution space, concluding that there are still problems with the way Web services are modeled and annotated, including: i) the absence of several useful data or analysis functions in the Web service "space"; ii) the lack of documentation of methods; iii) lack of compliance with the SOAP/WSDL specification among and between various programming-language libraries; and iv) incompatibility between various bioinformatics data formats. Although it was still difficult to solve real world problems posed to the developers by the biological researchers in attendance because of these problems, we note the promise of addressing these issues within a semantic framework.

14.

BioRuby: bioinformatics software for the Ruby programming language.

Goto, Naohisa; Prins, Pjotr; Nakao, Mitsuteru; Bonnal, Raoul; Aerts, Jan; Katayama, Toshiaki.

Bioinformatics ; 26(20): 2617-9, 2010 Oct 15.

Article in English | MEDLINE | ID: mdl-20739307

ABSTRACT

SUMMARY: The BioRuby software toolkit contains a comprehensive set of free development tools and libraries for bioinformatics and molecular biology, written in the Ruby programming language. BioRuby has components for sequence analysis, pathway analysis, protein modelling and phylogenetic analysis; it supports many widely used data formats and provides easy access to databases, external programs and public web services, including BLAST, KEGG, GenBank, MEDLINE and GO. BioRuby comes with a tutorial, documentation and an interactive environment, which can be used in the shell, and in the web browser. AVAILABILITY: BioRuby is free and open source software, made available under the Ruby license. BioRuby runs on all platforms that support Ruby, including Linux, Mac OS X and Windows. And, with JRuby, BioRuby runs on the Java Virtual Machine. The source code is available from http://www.bioruby.org/. CONTACT: katayama@bioruby.org

Subjects

Programming Languages, Software, Computational Biology, Factual Databases, MEDLINE, Phylogeny, Protein Sequence Analysis

15.

The DBCLS BioHackathon: standardization and interoperability for bioinformatics web services and workflows. The DBCLS BioHackathon Consortium*.

Katayama, Toshiaki; Arakawa, Kazuharu; Nakao, Mitsuteru; Ono, Keiichiro; Aoki-Kinoshita, Kiyoko F; Yamamoto, Yasunori; Yamaguchi, Atsuko; Kawashima, Shuichi; Chun, Hong-Woo; Aerts, Jan; Aranda, Bruno; Barboza, Lord Hendrix; Bonnal, Raoul Jp; Bruskiewich, Richard; Bryne, Jan C; Fernández, José M; Funahashi, Akira; Gordon, Paul Mk; Goto, Naohisa; Groscurth, Andreas; Gutteridge, Alex; Holland, Richard; Kano, Yoshinobu; Kawas, Edward A; Kerhornou, Arnaud; Kibukawa, Eri; Kinjo, Akira R; Kuhn, Michael; Lapp, Hilmar; Lehvaslaiho, Heikki; Nakamura, Hiroyuki; Nakamura, Yasukazu; Nishizawa, Tatsuya; Nobata, Chikashi; Noguchi, Tamotsu; Oinn, Thomas M; Okamoto, Shinobu; Owen, Stuart; Pafilis, Evangelos; Pocock, Matthew; Prins, Pjotr; Ranzinger, René; Reisinger, Florian; Salwinski, Lukasz; Schreiber, Mark; Senger, Martin; Shigemoto, Yasumasa; Standley, Daron M; Sugawara, Hideaki; Tashiro, Toshiyuki.

J Biomed Semantics ; 1(1): 8, 2010 Aug 21.

Article in English | MEDLINE | ID: mdl-20727200

ABSTRACT

Web services have become a key technology for bioinformatics, since life science databases are globally decentralized and the exponential increase in the amount of available data demands efficient systems that do not require transferring entire databases for every step of an analysis. However, various incompatibilities among database resources and analysis services make it difficult to connect and integrate these into interoperable workflows. To resolve this situation, we invited domain specialists from web service providers, client software developers, Open Bio* projects, the BioMoby project and researchers of emerging areas where a standard exchange data format is not well established, for an intensive collaboration entitled the BioHackathon 2008. The meeting was hosted by the Database Center for Life Science (DBCLS) and Computational Biology Research Center (CBRC) and was held in Tokyo from February 11th to 15th, 2008. In this report we highlight the work accomplished and the common issues that arose from this event, including the standardization of data exchange formats and services in the emerging fields of glycoinformatics, biological interaction networks, text mining, and phyloinformatics. In addition, common shared object development based on BioSQL, as well as technical challenges in large data management, asynchronous services, and security are discussed. As a result, we improved interoperability of web services in several fields; however, further cooperation among major database centers and continued collaborative efforts between service providers and software developers are still necessary for an effective advance in bioinformatics web service technologies.

16.

The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants.

Cock, Peter J A; Fields, Christopher J; Goto, Naohisa; Heuer, Michael L; Rice, Peter M.

Nucleic Acids Res ; 38(6): 1767-71, 2010 Apr.

Article in English | MEDLINE | ID: mdl-20015970

ABSTRACT

FASTQ has emerged as a common file format for sharing sequencing read data combining both the sequence and an associated per base quality score, despite lacking any formal definition to date, and existing in at least three incompatible variants. This article defines the FASTQ format, covering the original Sanger standard, the Solexa/Illumina variants and conversion between them, based on publicly available information such as the MAQ documentation and conventions recently agreed by the Open Bioinformatics Foundation projects Biopython, BioPerl, BioRuby, BioJava and EMBOSS. Being an open access publication, it is hoped that this description, with the example files provided as Supplementary Data, will serve in future as a reference for this important file format.

Subjects

DNA Sequence Analysis , Software , Computational Biology/history , 20th Century History , 21st Century History , DNA Sequence Analysis/history , DNA Sequence Analysis/standards
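
The FASTQ abstract above refers to a Sanger standard, Solexa/Illumina variants, and conversion between them. The following is a minimal sketch of that conversion, assuming the commonly documented conventions: Sanger FASTQ stores PHRED scores with an ASCII offset of 33, Illumina 1.3+ stores PHRED scores with an offset of 64, and Solexa stores Solexa-scaled scores with an offset of 64. All function names are hypothetical and chosen for illustration; this is not the implementation used by any of the Open Bioinformatics Foundation projects.

```python
import math

SANGER_OFFSET = 33    # assumed: Sanger FASTQ, PHRED scores, ASCII offset 33
ILLUMINA_OFFSET = 64  # assumed: Solexa and Illumina 1.3+ FASTQ, ASCII offset 64


def solexa_to_phred(q_solexa):
    """Map a Solexa-scaled quality score onto the PHRED scale."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def decode_sanger(quality):
    """Decode a Sanger quality string into a list of PHRED scores."""
    return [ord(ch) - SANGER_OFFSET for ch in quality]


def decode_illumina13(quality):
    """Decode an Illumina 1.3+ quality string into a list of PHRED scores."""
    return [ord(ch) - ILLUMINA_OFFSET for ch in quality]


def decode_solexa(quality):
    """Decode a Solexa quality string and round each score to a PHRED integer."""
    return [round(solexa_to_phred(ord(ch) - ILLUMINA_OFFSET)) for ch in quality]


def encode_sanger(phred_scores):
    """Encode PHRED scores as a Sanger quality string.

    Scores are clamped to 0..93 so that every character stays within
    printable ASCII (33..126).
    """
    return "".join(chr(min(max(q, 0), 93) + SANGER_OFFSET) for q in phred_scores)


def illumina13_to_sanger(quality):
    """Convert an Illumina 1.3+ quality string to its Sanger equivalent."""
    return encode_sanger(decode_illumina13(quality))
```

For example, `illumina13_to_sanger("@h")` gives `"!I"`, since both strings encode the PHRED scores 0 and 40. The Solexa path is lossy because scores are rounded after rescaling.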

17.

Inhibition of influenza virus infection by targeting genome conserved region with non-natural nucleic acid.

Takahashi, Tomoya; Ohzawa, Takuya; Sawada, Shinjiro; Kato, Nobuo; Goto, Naohisa; Nakamura, Shota; Yasunaga, Teruo; Kaihatsu, Kunihiro.

Nucleic Acids Symp Ser (Oxf) ; (53): 285-6, 2009.

Article in English | MEDLINE | ID: mdl-19749372

ABSTRACT

Two highly conserved 15-base sequences of the influenza A virus genome were identified by CONSERV software, which can detect contiguous conserved regions in biological sequences. The antiviral effect of phosphorothioate oligonucleotides that target these conserved sequences was evaluated by plaque formation assay and cell viability assay. Pre-treatment of cells with the anti-PB2 (RNA polymerase subunit) oligonucleotide reduced the size of plaques in a sequence-dependent manner. Post-treatment of cells with the anti-PB2 phosphorothioate oligonucleotide inhibited the virus-induced cytotoxicity.

Subjects

Influenza A Virus/genetics , Phosphorothioate Oligonucleotides , Animals , Base Sequence , Cell Line , Conserved Sequence , Dogs , Viral Genome , Influenza A Virus/growth & development , Phosphorothioate Oligonucleotides/chemistry , Viral RNA/chemistry , RNA-Dependent RNA Polymerase/antagonists & inhibitors , RNA-Dependent RNA Polymerase/genetics , Viral Plaque Assay , Viral Proteins/antagonists & inhibitors , Viral Proteins/genetics

18.

Direct metagenomic detection of viral pathogens in nasal and fecal specimens using an unbiased high-throughput sequencing approach.

Nakamura, Shota; Yang, Cheng-Song; Sakon, Naomi; Ueda, Mayo; Tougan, Takahiro; Yamashita, Akifumi; Goto, Naohisa; Takahashi, Kazuo; Yasunaga, Teruo; Ikuta, Kazuyoshi; Mizutani, Tetsuya; Okamoto, Yoshiko; Tagami, Michihira; Morita, Ryoji; Maeda, Norihiro; Kawai, Jun; Hayashizaki, Yoshihide; Nagai, Yoshiyuki; Horii, Toshihiro; Iida, Tetsuya; Nakaya, Takaaki.

PLoS One ; 4(1): e4219, 2009.

Article in English | MEDLINE | ID: mdl-19156205

ABSTRACT

With the severe acute respiratory syndrome epidemic of 2003 and renewed attention on avian influenza viral pandemics, new surveillance systems are needed for the earlier detection of emerging infectious diseases. We applied a "next-generation" parallel sequencing platform for viral detection in nasopharyngeal and fecal samples collected during seasonal influenza virus (Flu) infections and norovirus outbreaks from 2005 to 2007 in Osaka, Japan. Random RT-PCR was performed to amplify RNA extracted from 0.1-0.25 ml of nasopharyngeal aspirates (N = 3) and fecal specimens (N = 5), and more than 10 µg of cDNA was synthesized. Unbiased high-throughput sequencing of these 8 samples yielded 15,298-32,335 (average 24,738) reads in a single 7.5 h run. In nasopharyngeal samples, although whole genome analysis was not available because the majority (>90%) of reads were host genome-derived, 20-460 Flu-reads were detected, which was sufficient for subtype identification. In fecal samples, bacteria and host cells were removed by centrifugation, resulting in gain of 484-15,260 reads of norovirus sequence (78-98% of the whole genome was covered), except for one specimen that was under-detectable by RT-PCR. These results suggest that our unbiased high-throughput sequencing approach is useful for directly detecting pathogenic viruses without advance genetic information. Although its cost and technological availability make it unlikely that this system will very soon be the diagnostic standard worldwide, this system could be useful for the earlier discovery of novel emerging viruses and bioterrorism, which are difficult to detect with conventional procedures.

Subjects

Feces/virology , Nose/virology , Viral RNA/metabolism , DNA Sequence Analysis/methods , Base Sequence , Bacterial DNA/metabolism , Feces/chemistry , Gastroenteritis/diagnosis , Gastroenteritis/virology , Genetic Techniques , Humans , Human Influenza/diagnosis , Human Influenza/virology , Molecular Sequence Data , Norovirus/genetics , Orthomyxoviridae/genetics , Reverse Transcriptase Polymerase Chain Reaction , Nucleic Acid Sequence Homology

19.

Identification and characterization of a novel type III secretion system in trh-positive Vibrio parahaemolyticus strain TH3996 reveal genetic lineage and diversity of pathogenic machinery beyond the species level.

Okada, Natsumi; Iida, Tetsuya; Park, Kwon-Sam; Goto, Naohisa; Yasunaga, Teruo; Hiyoshi, Hirotaka; Matsuda, Shigeaki; Kodama, Toshio; Honda, Takeshi.

Infect Immun ; 77(2): 904-13, 2009 Feb.

Article in English | MEDLINE | ID: mdl-19075025

ABSTRACT

Vibrio parahaemolyticus is a bacterial pathogen causative of food-borne gastroenteritis. Whole-genome sequencing of V. parahaemolyticus strain RIMD2210633, which exhibits Kanagawa phenomenon (KP), revealed the presence of two sets of the genes for the type III secretion system (T3SS) on chromosomes 1 and 2, T3SS1 and T3SS2, respectively. Although T3SS2 of the RIMD2210633 strain is thought to be involved in human pathogenicity, i.e., enterotoxicity, the genes for T3SS2 have not been found in trh-positive (KP-negative) V. parahaemolyticus strains, which are also pathogenic for humans. In the study described here, the DNA region of approximately 100 kb that surrounds the trh gene of a trh-positive V. parahaemolyticus strain, TH3996, was sequenced and its genetic organization determined. This revealed the presence of the genes for a novel T3SS in this region. Animal experiments using the deletion mutant strains of a gene (vscC2) for the novel T3SS apparatus indicated that the T3SS is essential for the enterotoxicity of the TH3996 strain. PCR analysis showed that all the trh-positive V. parahaemolyticus strains tested possess the novel T3SS-related genes. Phylogenetic analysis demonstrated that although the novel T3SS is closely related to T3SS2 of KP-positive V. parahaemolyticus, it belongs to a distinctly different lineage. Furthermore, the two types of T3SS2 lineage are also found among pathogenic Vibrio cholerae non-O1/non-O139 strains. Our findings demonstrate that these two distinct types are distributed not only within a species but also beyond the species level and provide a new insight into the pathogenicity and evolution of Vibrio species.

Subjects

Bacterial Proteins/metabolism , Genetic Variation , Hemolysin Proteins/metabolism , Vibrio parahaemolyticus/classification , Vibrio parahaemolyticus/metabolism , Bacterial Proteins/genetics , Molecular Cloning , Bacterial DNA/genetics , Bacterial Gene Expression Regulation/physiology , Genomic Islands/genetics , Hemolysin Proteins/genetics , Molecular Sequence Data , Vibrio parahaemolyticus/genetics , Vibrio parahaemolyticus/pathogenicity , Virulence

20.

Computational search for over-represented 8-mers within the 5'-regulatory regions of 634 mouse testis-specific genes.

Yamashita, Akifumi; Goto, Naohisa; Nishiguchi, Seiji; Shimada, Kazunori; Yamanishi, Hiromichi; Yasunaga, Teruo.

Gene ; 427(1-2): 93-8, 2008 Dec 31.

Article in English | MEDLINE | ID: mdl-18817858

ABSTRACT

Accumulation of microarray data has enabled the computational analysis of gene expression in various tissues. Although genes showing testis-specific expression are the most abundant among genes exhibiting tissue-specific expression, no systematic study has been conducted of over-represented motifs within their regulatory regions. We have identified 117 over-represented 8-mers that appeared 2648 times within the regulatory regions of 634 testis-specific genes. Of these, 64 over-represented 8-mers were significantly more frequent in the regulatory regions of testis-specific genes than in those of non-testis-specific genes. In this group of 8-mers, four differed from the canonical cAMP response element (CRE) 8-mer by 1 letter, but the canonical CRE was not included in this group. We consider that CRE-like 8-mers participate in the regulated expression of testis-specific genes to a greater extent than the canonical CRE 8-mer.

Subjects

Computational Biology/methods , Cyclic AMP/metabolism , Gene Expression Profiling , Testis/metabolism , Animals , Male , Mice , Biological Models , Genetic Models , Statistical Models , Oligonucleotide Array Sequence Analysis , Genetic Promoter Regions , Nucleic Acid Regulatory Sequences , Response Elements , Genetic Transcription
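
The 8-mer abstract above rests on two basic operations: counting every 8-mer in a set of regulatory sequences, and finding 8-mers that differ from the canonical CRE by one letter. The sketch below illustrates both, assuming the canonical CRE is the palindrome TGACGTCA and that only the given strand is scanned. The function names are hypothetical, and the statistical test the authors used to call an 8-mer over-represented is not stated in the abstract and is not reproduced here.

```python
from collections import Counter

CANONICAL_CRE = "TGACGTCA"  # assumed canonical cAMP response element 8-mer


def count_kmers(sequences, k=8):
    """Count every k-mer across the sequences, skipping windows with non-ACGT letters."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def cre_like(kmers, reference=CANONICAL_CRE):
    """Return, sorted, the k-mers that differ from the reference by exactly one letter."""
    return sorted(kmer for kmer in kmers if hamming(kmer, reference) == 1)
```

A comparison between testis-specific and non-testis-specific genes could then start from two `count_kmers` results, one per gene set, followed by whichever significance test is appropriate for the data.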
